Abstract

We prove that for positive semidefinite matrices $A$ and $B$ the following
determinantal inequality holds:

$$\det(I + A\#B) \le \det(I + A^{1/2}B^{1/2}),$$

where $A\#B$ is the geometric mean of $A$ and $B$. We apply this inequality to the
study of interpolation methods in diffusion tensor imaging.


1 Introduction 


The topic of this paper has arisen in the study of interpolation methods for
image processing in diffusion tensor imaging (DTI). DTI is an imaging method
used in medical magnetic resonance imaging (MRI) whereby the data to be
imaged usually consists of a field of $3 \times 3$ statistical covariance
matrices $D(r) \in \mathcal{P}$, where the points $r$ lie on a 3-dimensional grid.
In order to improve the visual quality of the image it is necessary to
interpolate and/or extrapolate between neighbouring $D$-values. As there is no
unique way to do so, it is important to have a mathematical framework within
which to describe the various methods as well as their measures of goodness.
Such a framework has recently been discussed in [3,9], and the present work
grew out of this.

Email address: koenraad.audenaert@rhul.ac.uk (Koenraad M.R. Audenaert). 


Preprint submitted to Elsevier 


17 March 2015, 13:40 





The interpolation/extrapolation problem is easily formulated as follows. In
this context, all covariance matrices are real, symmetric, positive semidefinite
$3 \times 3$ matrices. Let $D_1$ and $D_2$ be two covariance matrices. Construct a
path $p \mapsto D(p)$, where $p \in [0,1]$ for interpolation or $p < 0$ or $p > 1$
for extrapolation, such that $D(0) = D_1$, $D(1) = D_2$ and $D(p) \ge 0$ for all $p$
in the interval of interest, and whereby certain quality criteria have to be
satisfied. Without going into too much detail, the quality criterion used in [9]
is based on the cube root of the determinant of $D(p)$, as this is one of the
quantities that provides structural details of the sample being imaged. It is
argued by some that interpolated/extrapolated values of this determinant should
not be “too large”, as this might lead to so-called “swelling” of certain
features in the reconstructed image.

One further requirement on the path $D(p)$ is that it be the shortest path
between $D_1$ and $D_2$, in a sense to be specified. This is easiest to satisfy by
defining a suitable metric $d(\cdot,\cdot)$ on $\mathcal{P}$ and letting the path
be a geodesic path. The requirement that all $D(p)$ on the path be positive
semidefinite can then be enforced by choosing a metric specific to the curved
space $\mathcal{P}$; this excludes the Euclidean metric $d(A,B) = \|A - B\|_2$, for
example, as its geodesics do not stay within $\mathcal{P}$. Still, many choices
remain. Probably the most studied $\mathcal{P}$-metric in matrix analysis is the
Riemannian metric $d(D_1,D_2) = \|\log(D_1^{-1/2} D_2 D_1^{-1/2})\|_2$ [2,
Chapter 6]. Accordingly, this metric has been given due consideration in DTI
as well [4]. However, one of its drawbacks is that it is inordinately sensitive
to very small eigenvalues of the covariance matrices. In particular, all
rank-deficient matrices are infinitely far apart in the Riemannian sense, no
matter how close they are in the Euclidean sense. For this and other reasons,
other metrics besides the Riemannian one are being studied for DTI.

The metrics studied in the present work are the so-called “Euclidean root
metric”, $d_H$, and the “Procrustes size-and-shape metric”, $d_S$; we will not
explain this nomenclature here. Both metrics are based on a reparameterisation
of the covariance matrices to enforce positive semidefiniteness, in that they
give the Euclidean distance between certain square roots of the covariance
matrices, $\|S_1 - S_2\|_2$. Here, a square root of a positive semidefinite matrix
$D$ is any matrix $S$ for which $D = S^*S$. Taking for example the Cholesky
decomposition for $S$ ($S$ being upper triangular with positive diagonal entries)
yields the Cholesky metric $d_C$. One gets the Euclidean root metric $d_H$ by
taking positive square roots; i.e. $S_i = Q_i := D_i^{1/2}$.

The Procrustes metric is the minimal one in the sense that it minimises
$\|S_1 - S_2\|_2$ over all possible choices of square roots. That is,

$$d_S(D_1, D_2) = \min_R \{\|Q_1 - RQ_2\|_2 : R \text{ unitary}\}.$$

In other words, we look for that square root of $D_2$ that is closest to the
positive square root of $D_1$. This minimisation can easily be done analytically,
since $\|Q_1 - RQ_2\|_2^2 = \mathrm{Tr}(Q_1^2 + Q_2^2) - 2\,\mathrm{Re}\,\mathrm{Tr}\, RQ_2Q_1$.
Hence, the optimal $R$ in the above minimisation is the unitary matrix for which
$RQ_2Q_1 = |Q_2Q_1|$, where $|X| = \sqrt{X^*X}$ denotes the modulus of the matrix
$X$. Thus the optimal $R$ is given by $R = U^*$, where $U$ is the unitary factor in
the polar decomposition $Q_2Q_1 = U|Q_2Q_1|$.
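The analytic minimisation above can be checked numerically. The following is
a minimal sketch, assuming NumPy and real symmetric test matrices; the helper
names `psd_sqrt` and `polar_unitary` are illustrative and do not come from the
paper. It obtains $U$ from a singular value decomposition and compares the
Procrustes distance with the closed form
$d_S^2 = \mathrm{Tr}(D_1 + D_2) - 2\,\mathrm{Tr}|Q_2Q_1|$ implied by the trace identity.

```python
import numpy as np

def psd_sqrt(M):
    """Positive square root of a real symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)            # clip tiny negative round-off
    return (V * np.sqrt(w)) @ V.T

def polar_unitary(X):
    """Unitary factor U of the polar decomposition X = U |X|, via the SVD."""
    W, _, Vh = np.linalg.svd(X)
    return W @ Vh

rng = np.random.default_rng(0)
G1, G2 = rng.standard_normal((2, 3, 3))
D1, D2 = G1 @ G1.T, G2 @ G2.T            # two random 3x3 covariance matrices
Q1, Q2 = psd_sqrt(D1), psd_sqrt(D2)

U = polar_unitary(Q2 @ Q1)
d_H = np.linalg.norm(Q1 - Q2)            # Frobenius norm: Euclidean root distance
d_S = np.linalg.norm(Q1 - U.T @ Q2)      # Procrustes distance, using R = U*

# Tr|X| equals the sum of the singular values of X.
trace_mod = np.linalg.svd(Q2 @ Q1, compute_uv=False).sum()
d_S_closed = np.sqrt(np.trace(D1 + D2) - 2.0 * trace_mod)

print(f"d_H = {d_H:.6f}, d_S = {d_S:.6f}, closed form = {d_S_closed:.6f}")
```

Since $R = I$ is one of the admissible unitaries, the printed values should
satisfy $d_S \le d_H$.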

The geodesics induced by these metrics are obtained by considering linear
paths in the square root space, and taking the square to map them back to
$\mathcal{P}$. That is, for the Euclidean root metric,

$$D_H(p) = |pQ_1 + (1-p)Q_2|^2,$$

and for the Procrustes metric

$$D_S(p) = |pQ_1 + (1-p)U^*Q_2|^2.$$


The question that we wanted to answer about these metrics is how the
determinants of the interpolated values of the various $D(p)$ behave. Our main
result shows that for $0 \le p \le 1$ the Procrustes path always produces
determinants that are smaller than or equal to those of the Euclidean root path,
i.e. the Procrustes interpolation is less prone to swelling. For the
extrapolation ($p < 0$ or $p > 1$) numerical calculations confirmed our intuition
that there can be no guaranteed ordering between the determinants of the two
paths.
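To illustrate the comparison of the two paths, here is a small sketch,
assuming NumPy and real symmetric inputs. The function names
`euclid_root_path` and `procrustes_path` are hypothetical, chosen here for
illustration. It evaluates both interpolants on a grid in $[0,1]$ and prints
their determinants side by side.

```python
import numpy as np

def psd_sqrt(M):
    """Positive square root of a real symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)
    return (V * np.sqrt(w)) @ V.T

def polar_unitary(X):
    """Unitary factor U of the polar decomposition X = U |X|, via the SVD."""
    W, _, Vh = np.linalg.svd(X)
    return W @ Vh

def euclid_root_path(Q1, Q2, p):
    S = p * Q1 + (1 - p) * Q2
    return S.T @ S                       # |S|^2 = S* S

def procrustes_path(Q1, Q2, p):
    U = polar_unitary(Q2 @ Q1)
    S = p * Q1 + (1 - p) * (U.T @ Q2)
    return S.T @ S

rng = np.random.default_rng(1)
G1, G2 = rng.standard_normal((2, 3, 3))
Q1, Q2 = psd_sqrt(G1 @ G1.T), psd_sqrt(G2 @ G2.T)

ps = np.linspace(0.0, 1.0, 11)
det_H = np.array([np.linalg.det(euclid_root_path(Q1, Q2, p)) for p in ps])
det_S = np.array([np.linalg.det(procrustes_path(Q1, Q2, p)) for p in ps])
for p, dh, ds in zip(ps, det_H, det_S):
    print(f"p = {p:.1f}   det D_H = {dh:.6f}   det D_S = {ds:.6f}")
```

Both paths share their endpoints, so the two determinant columns agree at
$p = 0$ and $p = 1$; changing `ps` to values outside $[0,1]$ lets one experiment
with the extrapolation regime, for which no ordering is claimed.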


2 Main Result 


The following theorem answers the question posed in the introduction; in fact 
it is more general than required as it holds for all complex positive semidefinite 
matrices, of any dimension. 

Theorem 1 Let $Q_1$ and $Q_2$ be $n \times n$ positive semidefinite matrices. Let
$Q_2Q_1$ have polar decomposition $Q_2Q_1 = U|Q_2Q_1|$. Then $\det(Q_1 + U^*Q_2)$ is
real and non-negative and

$$\det(Q_1 + U^*Q_2) \le \det(Q_1 + Q_2).$$

Here we have absorbed the interpolation parameter $p$ in $Q_1$ and $1 - p$ in $Q_2$.
For $0 < p < 1$ this does not change the signature of $Q_1$ or $Q_2$, nor does it
change the unitary factor $U$.

That $\det(Q_1 + U^*Q_2)$ is real and non-negative is easy to show. Indeed, we
have $U^*Q_2Q_1 = |Q_2Q_1|$, so that
$\det(Q_1 + U^*Q_2)\det(Q_1) = \det(Q_1^2 + |Q_2Q_1|) \ge 0$.
Dividing by the positive number $\det(Q_1)$ (for invertible $Q_1$) shows that
$\det(Q_1 + U^*Q_2) \ge 0$.
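Because the theorem is stated for complex positive semidefinite matrices of
any dimension, a quick numerical sanity check on random complex Hermitian
instances may be helpful. This is only a sketch, assuming NumPy; the helper
`random_psd`, the dimension 4 and the number of trials are arbitrary choices
made here, not taken from the paper.

```python
import numpy as np

def random_psd(n, rng):
    """Random complex Hermitian positive semidefinite n x n matrix."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G @ G.conj().T

rng = np.random.default_rng(2)
results = []
for _ in range(200):
    Q1, Q2 = random_psd(4, rng), random_psd(4, rng)
    W, _, Vh = np.linalg.svd(Q2 @ Q1)
    U = W @ Vh                                   # Q2 Q1 = U |Q2 Q1|
    lhs = np.linalg.det(Q1 + U.conj().T @ Q2)    # complex in floating point
    rhs = np.linalg.det(Q1 + Q2).real
    results.append((lhs, rhs))

worst_imag = max(abs(lhs.imag) / max(1.0, rhs) for lhs, rhs in results)
worst_ratio = max(lhs.real / rhs for lhs, rhs in results)
print(f"largest relative imaginary part: {worst_imag:.2e}")
print(f"largest ratio det(Q1 + U*Q2) / det(Q1 + Q2): {worst_ratio:.6f}")
```

The imaginary parts should vanish up to round-off, and the reported ratio
should not exceed 1.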

The proof of the inequality of this theorem relies on a number of concepts
from matrix analysis, which we introduce first in some detail (for the benefit
of those readers working in diffusion tensor imaging). In Section 3 we consider
the related concepts of majorisation and log-majorisation, presenting their
definitions and their most important properties. In Section 4 we consider the
matrix generalisation of the geometric mean. Finally, in Section 5 we present
the proof of the theorem.


3 Majorisation 


In this section we consider vectors $x = (x_1, \ldots, x_n)$ and
$y = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$. We denote by $x^\downarrow$ the vector
consisting of the elements of $x$, sorted in non-ascending order. Thus,
$x^\downarrow_k$ is the $k$-th largest element of $x$.

We will now introduce several relations between $x$ and $y$ that come under the
heading of majorisation. The standard work about the theory and applications
of majorisation is undoubtedly [7], to which we refer for more details. A more
concise treatment can be found in [1, Chapter II].

We say that $x$ weakly majorises $y$, denoted $y \prec_w x$, if and only if, for
all $k$, the sum of the $k$ largest elements of $x$ dominates the sum of the $k$
largest elements of $y$:

$$y \prec_w x \iff \sum_{i=1}^k y^\downarrow_i \le \sum_{i=1}^k x^\downarrow_i, \quad k = 1, 2, \ldots, n.$$

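If in addition the sum of all elements of $x$ equals the sum of all elements of
$y$, then we say that $x$ majorises $y$:

$$y \prec x \iff
\begin{cases}
\sum_{i=1}^k y^\downarrow_i \le \sum_{i=1}^k x^\downarrow_i, & k = 1, 2, \ldots, n-1; \\
\sum_{i=1}^n y_i = \sum_{i=1}^n x_i.
\end{cases}$$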


A closely related concept is log-majorisation, which concerns vectors in
$\mathbb{R}^n_+$. We say that $x$ weakly log-majorises $y$, denoted
$y \prec_{w,\log} x$, if and only if $\log y \prec_w \log x$. Thus,

$$y \prec_{w,\log} x \iff \prod_{i=1}^k y^\downarrow_i \le \prod_{i=1}^k x^\downarrow_i, \quad k = 1, 2, \ldots, n.$$

Likewise, $x$ log-majorises $y$, denoted $y \prec_{\log} x$, if and only if
$\log y \prec \log x$.
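The definitions of this section translate directly into partial-sum
comparisons of sorted vectors. The sketch below assumes NumPy; the function
names and the tolerance argument `tol` are illustrative choices, not notation
from the paper. The argument order mirrors the notation: `majorised(y, x)`
tests $y \prec x$.

```python
import numpy as np

def weakly_majorised(y, x, tol=1e-12):
    """True if y is weakly majorised by x: partial sums of sorted y <= those of x."""
    ys = np.sort(np.asarray(y, dtype=float))[::-1]   # non-ascending order
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    return bool(np.all(np.cumsum(ys) <= np.cumsum(xs) + tol))

def majorised(y, x, tol=1e-12):
    """True if y is majorised by x: weak majorisation plus equal total sums."""
    return weakly_majorised(y, x, tol) and bool(abs(np.sum(y) - np.sum(x)) <= tol)

def log_majorised(y, x, tol=1e-12):
    """True if y is log-majorised by x, for vectors with strictly positive entries."""
    return majorised(np.log(y), np.log(x), tol)

print(majorised([2, 1, 1], [3, 1, 0]))         # True: partial sums 2,3,4 vs 3,4,4
print(majorised([3, 1, 0], [2, 1, 1]))         # False: first partial sum 3 > 2
print(weakly_majorised([1, 1, 1], [3, 1, 0]))  # True: totals differ, so only weak
print(log_majorised([2, 2], [4, 1]))           # True: partial products 2,4 vs 4,4
```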

Next we discuss which functions preserve the (weak) majorisation ordering.
A function $\Phi: \mathbb{R}^n \to \mathbb{R}^m$ is called strongly isotone if and
only if it preserves weak majorisation:

$$y \prec_w x \implies \Phi(y) \prec_w \Phi(x).$$

A function is called isotone if and only if

$$y \prec x \implies \Phi(y) \prec_w \Phi(x).$$

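We will need the following characterisation of isotony in the case that $m = 1$
[1, Theorem II.3.14].

Lemma 1 (Schur) A differentiable function $\Phi: \mathbb{R}^n \to \mathbb{R}$ is
isotone if and only if it satisfies

(1) $\Phi$ is permutation invariant,

(2) for all $x \in \mathbb{R}^n$ and for all $i, j$:

$$(x_i - x_j)\left(\frac{\partial \Phi}{\partial x_i}(x) - \frac{\partial \Phi}{\partial x_j}(x)\right) \ge 0.$$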

4 Geometric Mean 

Recall that the geometric mean of two positive real numbers $x$ and $y$ is given
by $\sqrt{xy}$. As is the case with all means, when one wishes to generalise the
geometric mean to two positive semidefinite matrices $A$ and $B$, one is faced
with the usual problem of non-commutativity: there exists an infinite number
of expressions involving $A$ and $B$ that reduce to $\sqrt{AB}$ when $A$ and $B$
commute. For example, $\sqrt{AB}$ and $\sqrt{A}\sqrt{B}$ are in general different
matrices, but both are equal in the commutative case.

To resolve this problem, one has to impose a number of conditions in order to 
obtain a unique generalisation. Kubo and Ando [6] have developed a very nice 
theory of matrix means that does exactly that. It is now standard to define 
the matrix geometric mean as follows: 

Definition 1 Let $A$ and $B$ be $n \times n$ positive definite matrices. Then their
geometric mean, denoted $A\#B$, is defined as

$$A\#B := A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}.$$







When A and/or B are rank-deficient, the geometric mean is defined via a 
limiting procedure. 


While this is not obvious from the definition, the geometric mean is symmetric
in its arguments; that is, $A\#B = B\#A$.

Clearly, for $a, b > 0$ we have $(aA)\#(bB) = \sqrt{ab}\,(A\#B)$.
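As an illustration of Definition 1 and of the two properties just stated, the
following sketch evaluates the geometric mean through eigendecompositions and
checks symmetry and scaling numerically. It assumes NumPy and real symmetric
positive definite inputs; `sym_power`, `geometric_mean` and `random_pd` are
names introduced here for illustration.

```python
import numpy as np

def sym_power(M, t):
    """M**t for a real symmetric positive definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)     # symmetrise against round-off
    return (V * w**t) @ V.T

def geometric_mean(A, B):
    """A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    Ah, Aih = sym_power(A, 0.5), sym_power(A, -0.5)
    return Ah @ sym_power(Aih @ B @ Aih, 0.5) @ Ah

def random_pd(n, rng):
    """Random positive definite matrix, shifted away from singularity."""
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.1 * np.eye(n)

rng = np.random.default_rng(3)
A, B = random_pd(3, rng), random_pd(3, rng)
M = geometric_mean(A, B)

sym_gap = np.abs(M - geometric_mean(B, A)).max()                       # A#B vs B#A
scale_gap = np.abs(geometric_mean(2.0 * A, 8.0 * B) - 4.0 * M).max()   # sqrt(2*8) = 4
print(f"symmetry gap: {sym_gap:.2e}, scaling gap: {scale_gap:.2e}")
```

Both gaps should be at the level of floating-point round-off.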

One can show that $A\#B$ emerges as the solution of the following optimisation
problem: the set of positive semidefinite matrices $X$ for which the block matrix

$$\begin{pmatrix} A & X \\ X & B \end{pmatrix}$$

is itself positive semidefinite, has a maximum in the positive semidefinite
ordering, and this maximum is exactly $A\#B$.

In the present work, we will need two more properties of the geometric mean:

Lemma 2 (Monotonicity) Let $0 \le A$ and $0 \le B_1 \le B_2$. Then

$$A\#B_1 \le A\#B_2.$$

Lemma 3 Let $A, B \ge 0$ and $r \ge 1$. If $A\#B \le I$, then $A^r\#B^r \le I$.


The proof of this last statement can be found in [5], as part of the proof of its 
Theorem 4.6.9. 


5 Proof of the theorem. 

We start with a lemma relating the eigenvalues of the matrix geometric mean
to the eigenvalues of what could be designated as a “naive” matrix geometric
mean. This is actually a special case of a more general inequality due to
Matharu and Aujla [8, Theorem 2.10], but we provide a stand-alone proof for
the benefit of the reader. Henceforth we will denote the vector of eigenvalues
of a matrix $X$ by $\lambda(X)$.

Lemma 4 Let $A, B \ge 0$. Then

$$\lambda(A\#B) \prec_{\log} \lambda(\sqrt{A}\sqrt{B}).$$

Proof. Throughout the proof we assume that $A$ is invertible. The general case
follows from continuity considerations.

We first show that the inequality holds for the largest eigenvalue $\lambda_1$,
that is:

$$\lambda_1(A\#B) \le \lambda_1(\sqrt{A}\sqrt{B}). \qquad (1)$$

Let $a = \lambda_1(\sqrt{A}\sqrt{B})$. Thus $B^{1/2} \le aA^{-1/2}$. By monotonicity
of the geometric mean, we then have
$A^{1/2}\#B^{1/2} \le \sqrt{a}\,(A^{1/2}\#A^{-1/2}) = \sqrt{a}\,I$. Using Lemma 3
with $r = 2$, we obtain $A\#B \le aI$. This says that $\lambda_1(A\#B) \le a$, as
required.

To prove the statement of the lemma, we use the so-called “Weyl-trick”. Let
$A^{\wedge k}$ be the $k$-th antisymmetric tensor power of $A$; this is the
restriction of the $k$-th tensor power of $A$, $A^{\otimes k}$, to the totally
antisymmetric subspace of $(\mathbb{C}^n)^{\otimes k}$. The Weyl-trick exploits two
facts about these powers. Firstly, for $A \ge 0$ the largest eigenvalue of
$A^{\wedge k}$ is given by

$$\lambda_1(A^{\wedge k}) = \lambda_1(A)\lambda_2(A)\cdots\lambda_k(A).$$

Secondly, any expression involving products and/or fractional matrix powers
“commutes” with taking the $k$-th antisymmetric tensor power. In particular,
$(A\#B)^{\wedge k} = A^{\wedge k}\#B^{\wedge k}$.

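Thus, exploiting (1), we get

$$\prod_{i=1}^k \lambda_i(A\#B) = \lambda_1((A\#B)^{\wedge k}) = \lambda_1(A^{\wedge k}\#B^{\wedge k})
\le \lambda_1\big(\sqrt{A^{\wedge k}}\sqrt{B^{\wedge k}}\big) = \lambda_1\big((\sqrt{A}\sqrt{B})^{\wedge k}\big) = \prod_{i=1}^k \lambda_i(\sqrt{A}\sqrt{B}).$$

This already shows that we have a weak log-majorisation:
$\lambda(A\#B) \prec_{w,\log} \lambda(\sqrt{A}\sqrt{B})$.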

Because of the Cauchy-Binet theorem, we also have

$$\det(A\#B) = \sqrt{\det(A)\det(B)} = \det(\sqrt{A}\sqrt{B}),$$

which says that $\prod_{i=1}^n \lambda_i(A\#B) = \prod_{i=1}^n \lambda_i(\sqrt{A}\sqrt{B})$.
Thus, the weak log-majorisation relation can be strengthened to a
log-majorisation. □
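Lemma 4 can be probed numerically by comparing partial products of sorted
eigenvalues. The sketch below assumes NumPy and real positive definite test
matrices, with illustrative helper names. To stay within symmetric
eigenvalue routines it uses the fact that $\sqrt{A}\sqrt{B}$ is similar to the
symmetric matrix $A^{1/4}B^{1/2}A^{1/4}$ and therefore has the same eigenvalues.

```python
import numpy as np

def sym_power(M, t):
    """M**t for a real symmetric positive definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * w**t) @ V.T

def geometric_mean(A, B):
    Ah, Aih = sym_power(A, 0.5), sym_power(A, -0.5)
    return Ah @ sym_power(Aih @ B @ Aih, 0.5) @ Ah

def random_pd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.1 * np.eye(n)

def sorted_eigs(M):
    """Eigenvalues of a symmetric matrix in non-ascending order."""
    return np.sort(np.linalg.eigvalsh((M + M.T) / 2))[::-1]

rng = np.random.default_rng(4)
A, B = random_pd(4, rng), random_pd(4, rng)

lam_gm = sorted_eigs(geometric_mean(A, B))
A4 = sym_power(A, 0.25)
lam_naive = sorted_eigs(A4 @ sym_power(B, 0.5) @ A4)   # same spectrum as sqrt(A) sqrt(B)

partial_gm = np.cumprod(lam_gm)         # products of the k largest eigenvalues
partial_naive = np.cumprod(lam_naive)
for k, (g, m) in enumerate(zip(partial_gm, partial_naive), start=1):
    print(f"k = {k}: {g:.6f} <= {m:.6f}")
```

For $k = n$ the two partial products are both equal to $\sqrt{\det(A)\det(B)}$,
so the last printed line should show equality up to round-off.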

Theorem 2 Let $A, B \ge 0$. Then

$$\det(I + A\#B) \le \det(I + A^{1/2}B^{1/2}). \qquad (2)$$

Proof. Note that, for any $X > 0$, we have
$\log\det X = \mathrm{Tr}\log X = \sum_{i=1}^n \log\lambda_i(X)$.
So we equivalently need to show

$$\mathrm{Tr}\log(I + A\#B) \le \mathrm{Tr}\log(I + A^{1/2}B^{1/2}). \qquad (3)$$

By Lemma 4,

$$\log\lambda(A\#B) \prec \log\lambda(A^{1/2}B^{1/2}).$$

This implies (3) if we can show that the function
$\Phi(x) := \sum_{i=1}^n \log(1 + \exp(x_i))$ is isotone, since
$\Phi(\log\lambda(X)) = \sum_{i=1}^n \log(1 + \lambda_i(X))$.

Clearly, this function is permutation symmetric. Furthermore,

$$\partial\Phi/\partial x_i = \exp(x_i)/(1 + \exp(x_i)) = 1/(1 + \exp(-x_i)),$$

which is monotonically increasing in $x_i$, so that $x_i - x_j$ and
$\partial\Phi/\partial x_i - \partial\Phi/\partial x_j$ always have the same sign.
Hence, the second condition for isotony is also satisfied. □
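A numerical spot check of inequality (2) on random instances can serve as a
sanity test of the statement. This is a sketch only, assuming NumPy and real
positive definite matrices; the helper names, the dimension and the number of
trials are arbitrary choices made for illustration.

```python
import numpy as np

def sym_power(M, t):
    """M**t for a real symmetric positive definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * w**t) @ V.T

def geometric_mean(A, B):
    Ah, Aih = sym_power(A, 0.5), sym_power(A, -0.5)
    return Ah @ sym_power(Aih @ B @ Aih, 0.5) @ Ah

def random_pd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.1 * np.eye(n)

rng = np.random.default_rng(5)
n = 3
gaps = []
for _ in range(500):
    A, B = random_pd(n, rng), random_pd(n, rng)
    lhs = np.linalg.det(np.eye(n) + geometric_mean(A, B))
    rhs = np.linalg.det(np.eye(n) + sym_power(A, 0.5) @ sym_power(B, 0.5))
    gaps.append((rhs - lhs) / rhs)       # relative slack in inequality (2)
gaps = np.array(gaps)

print(f"smallest relative slack over {len(gaps)} trials: {gaps.min():.3e}")
```

A non-negative smallest slack (up to round-off) is consistent with Theorem 2;
it is of course no substitute for the proof.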

Proof of Theorem 1

Let us consider the matrices $Q_1$, $Q_2$ and $U$ given in the statement of the
theorem. Thus $U^*Q_2Q_1 = |Q_2Q_1|$. We will assume that $Q_1$ is invertible, so
$\det(Q_1) \ne 0$. The general case follows from continuity of the determinant.

Because of the Cauchy-Binet theorem, the statement of the theorem is
equivalent with

$$\det((Q_1 + U^*Q_2)Q_1) \le \det(Q_1(Q_1 + Q_2)),$$

which becomes

$$\det(Q_1^2 + |Q_2Q_1|) \le \det(Q_1^2 + Q_1Q_2).$$

Applying the Cauchy-Binet theorem a second time we can divide out $Q_1$ from
each side in a well-chosen way, and get the equivalent statement

$$\det(I + Q_1^{-1}(Q_1Q_2^2Q_1)^{1/2}Q_1^{-1}) \le \det(I + Q_1^{-1/2}Q_2Q_1^{-1/2}).$$

Using the substitutions $A = Q_1^{-2}$ and $B = Q_2^2$, this can be rewritten as

$$\det(I + A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}) \le \det(I + A^{1/4}B^{1/2}A^{1/4}),$$

or, in terms of the geometric mean,

$$\det(I + A\#B) \le \det(I + A^{1/2}B^{1/2}).$$


The validity of this inequality is just Theorem 2. □
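The chain of reductions in this proof can be traced numerically. The sketch
below assumes NumPy and real positive definite $Q_1$, $Q_2$, with illustrative
helper names. It checks that, with $A = Q_1^{-2}$ and $B = Q_2^2$, the matrix
$Q_1^{-1}|Q_2Q_1|Q_1^{-1}$ coincides with $A\#B$, and that
$\det(Q_1 + U^*Q_2) = \det(Q_1)\det(I + A\#B)$.

```python
import numpy as np

def sym_power(M, t):
    """M**t for a real symmetric positive definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * w**t) @ V.T

def geometric_mean(A, B):
    Ah, Aih = sym_power(A, 0.5), sym_power(A, -0.5)
    return Ah @ sym_power(Aih @ B @ Aih, 0.5) @ Ah

def random_pd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.5 * np.eye(n)     # shift keeps Q1 well away from singular

rng = np.random.default_rng(6)
n = 3
Q1, Q2 = random_pd(n, rng), random_pd(n, rng)

W, s, Vh = np.linalg.svd(Q2 @ Q1)
U = W @ Vh                               # unitary factor: Q2 Q1 = U |Q2 Q1|
mod = Vh.T @ np.diag(s) @ Vh             # modulus |Q2 Q1|

A, B = sym_power(Q1, -2.0), Q2 @ Q2      # substitutions used in the proof
gm = geometric_mean(A, B)
Q1inv = np.linalg.inv(Q1)

gm_gap = np.abs(Q1inv @ mod @ Q1inv - gm).max()
lhs = np.linalg.det(Q1 + U.T @ Q2)
lhs_via_gm = np.linalg.det(Q1) * np.linalg.det(np.eye(n) + gm)
rhs = np.linalg.det(Q1 + Q2)
print(f"gap between Q1^-1 |Q2 Q1| Q1^-1 and A#B: {gm_gap:.2e}")
print(f"det(Q1 + U*Q2) = {lhs:.6f}, via A#B = {lhs_via_gm:.6f}, det(Q1 + Q2) = {rhs:.6f}")
```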